ABSTRACT

The failure of readability formulas can be attributed
to three weaknesses in the formulas. First, they ignore or violate
current knowledge about the reading process. Most formulas consider
only sentence length and word difficulty while ignoring factors that
influence text comprehensibility, such as cohesion, the number of
inferences required, the number of items to remember, complexity of
ideas, rhetorical structure, dialect, and required schemata. Nor do
they account for reader-specific factors such as interest and the
purpose for reading. Second, readability formulas lack solid
statistical grounding. The most respected formulas have been
validated by test lessons that were intended only as practice
exercises, never as measures of text comprehensibility or as
indicators of reading ability across age, class, or cultural groups.
Third, readability formulas are used inappropriately in two of the
contexts in which they appear to be most valuable. Even a formula
with some validity, used with appropriate texts and readers, cannot
correctly predict how a particular reader will interact with a
particular book. (HTH)
Why Readability Formulas Fail

Being able to measure the readability of a text with a 
simple formula is an attractive prospect, and many groups have 
been using readability formulas in a variety of situations where 
estimates of text complexity are thought to be necessary. The 
most obvious and explicit use of readability formulas is by 
educational publishers designing basal and remedial reading 
texts; some states, in fact, will consider using a basal series 
only if it fits certain readability formula criteria. 
Increasingly, public documents such as insurance policies, tax 
forms, contracts, and jury instructions must meet criteria stated 
in terms of readability formulas. 

Unfortunately, readability formulas just don't fulfill their 
promise. This failure can be attributed to three weaknesses in 
the formulas. From a theoretical point of view, they ignore or 
violate much of current knowledge about reading and the reading 
process. Second, their statistical bases are shaky, being at 
once poorly supported mathematically and difficult to generalize.
Finally, as practical tools either for matching children and
texts or for providing guidelines for writers, they are totally
inappropriate. Criticisms such as these have been leveled at 
readability formulas from many quarters (Gilliland, 1972; Redish, 
1979; Kintsch & Vipond, 1977), but the formulas' uses have 
expanded in spite of the growing number of papers discussing
their weaknesses. We attempt here to categorize and summarize
some of the problems with readability formulas and their use. 

Factors Not in the Formulas 

The first category of problem involves the discrepancy 
between the characteristics of texts which readability formulas 
measure and those which we know to influence text 
comprehensibility. Because most of the formulas include only 
sentence length and word difficulty as factors, they can account 
only indirectly for other factors that make a particular text 
difficult, such as degree of discourse cohesion, number of
inferences required, number of items to remember, complexity of 
ideas, rhetorical structure, dialect, and background knowledge 
required. Further, because the formulas are measurements based 
on a text isolated from the context of its use, they cannot
reflect such reader-specific factors as motivation, interest, 
competitiveness, values, and purpose. 
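
To make concrete how little such a formula looks at, here is a minimal sketch of one widely used formula, the Flesch Reading Ease score, which combines only average sentence length and average syllables per word. The constants are the published Flesch coefficients; the function names and the vowel-group syllable counter are assumptions made for illustration, and the counter is only a rough heuristic, not the procedure any particular study used.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores are meant to indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


# Same words and same sentence lengths, but the second version is scrambled.
coherent = "The tree was burned. It grew new bark."
scrambled = "Bark new grew it. Burned was tree the."
print(flesch_reading_ease(coherent) == flesch_reading_ease(scrambled))
```

Because the score depends only on counts of sentences, words, and syllables, the coherent passage and its scrambled counterpart receive the same score. Nothing in the computation refers to cohesion, inference, or the reader.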

Readability formulas fail to account for differences in 
readers' dialect and cultural backgrounds. For example, a 
passage in Black Vernacular from the Bridge series (Simpkins, 
1977), a cross-cultural reading program, starts:

Willie went and got hisself a lightweight gig. The
gig wasn't saying too much. It wasn't paying nothing
but chump change.

Readers familiar with this form of Black Vernacular find the 
passage relatively simple; others can infer the meanings of
individual words only with difficulty. 

Because they view texts so narrowly, readability formulas 
also fail to measure the effect of the context in which a passage 
is read. A health information sheet describing the concept and 
treatment of hypertension, for example, may communicate quite 
effectively if a patient has enough time to read it and feels 
comfortable asking a physician for clarification. In a rushed, 
brusque encounter, however, the document would be much less 
comprehensible.

Lack of Statistical Basis 

Despite the shortcomings of readability formulas on 
theoretical grounds, strong empirical evidence of their 
predictive value might justify their use for some tasks. 
Unfortunately, when such evidence is examined, the second major 
problem with readability formulas, their lack of solid
statistical grounding, becomes apparent. Many of the hundreds of
formulas in existence were validated only in terms of earlier 
formulas. The early formulas, in turn, were validated using the
McCall-Crabbs Standard Test Lessons in Reading (McCall & Crabbs,
1950, 1961). But the McCall-Crabbs lessons were intended only as
practice exercises in reading, never as measures of comprehension
or text comprehensibility; nor were they intended to be general
indicators of reading ability across age, class, or cultural
groups. Nevertheless, the most respected formulas have all used
the McCall-Crabbs lessons as the criterion of difficulty
(Stevens, 1980).

Spache (1978), a readability formula designer, stated the 
problem succinctly: 

The reading level given by the formula should mean
that a child with that level of reading ability could 
read the book with adequate comprehension and a 
reasonable number of oral reading errors. This 
assumption has seldom if ever been tested in the 
development of this and other readability formulas 
(emphasis added).

While validation studies were generally not performed in the
course of developing readability formulas, a fair number were 
done after the fact. In a comprehensive review of such studies, 
Klare (1976) noted that 39 of 65 studies demonstrated a positive 
correlation between readability formula estimates of difficulty 
and reader performance on independent criteria such as reading
speed or comprehension. However, even this unconvincing
performance is undercut by his observation that positive results 
are more likely to be reported in journals than negative ones and 
by the fact that when comprehension, rather than reading speed, 
is used as the independent measure of text difficulty, only half 
of the studies indicated positive correlations with readability 
formula estimates. Lockman (1957) computed Flesch Reading Ease 
scores for nine sets of instructions for psychological tests, 
then had 171 naval cadets rate them on "understandability." The
rank-order correlation between the two sets of measurements was 
-0.65, a strong correlation but in the wrong direction. 
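
For readers unfamiliar with rank-order correlation, the sketch below shows the usual Spearman computation for two rankings without ties. That Lockman used this exact variant is an assumption, and the function name and the ranks in the example are hypothetical; they are not Lockman's data.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank-order correlation for two tie-free rankings.

    Uses rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the
    difference between the two ranks assigned to the same item.
    """
    if len(ranks_a) != len(ranks_b) or len(ranks_a) < 2:
        raise ValueError("need two rankings of the same length (at least 2)")
    n = len(ranks_a)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d_squared / (n * (n * n - 1))


# Hypothetical ranks for nine texts (1 = easiest), for illustration only.
formula_ranks = [1, 2, 3, 4, 5, 6, 7, 8, 9]
reader_ranks = [9, 8, 7, 6, 5, 4, 3, 2, 1]
print(spearman_rho(formula_ranks, formula_ranks))  # 1.0: perfect agreement
print(spearman_rho(formula_ranks, reader_ranks))   # -1.0: complete reversal
```

A value near +1 would mean the formula and the readers order the texts the same way; a negative value such as the one Lockman reported means the two orderings tend to run in opposite directions.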

Common sense also leads us to wonder how generalizable
readability formula estimates are beyond the precise situation in 
which they were validated. Spache (1978) developed a
revised version of his 1953 formula, saying:

If a readability formula is to continue to reflect 
accurate estimates of the difficulty of today's books, 
it, too, must change. 

That is, a formula validated with one group of students and one 
type of texts is found to be invalid for the same types of 
students and texts as conditions change over a 25-year period. 
The effects on validity of the formula for readers having 
different cultural backgrounds or dialects must be considerably
greater.

Inappropriate Use 

This leads us to the third general failing of the 
readability formulas: Their use is inappropriate in two of the 
contexts in which they seem most valuable. The first of these is 
the selection of an appropriate text for a child in school. Even 
if we assume the formulas have some limited validity and even if 
we are working with appropriate groups of texts and readers, we 
can never assume that the formula will correctly predict how a 
particular reader will interact with a particular book. 

For example, Don't Forget the Bacon (Hutchins,
1976) is a children's book that scores at grade level 2.7 using
the Spache (1978) formula. It has mostly one-syllable, easy
words and short, simple sentences (e.g., "a pile of chairs?").
Nevertheless, some children in fourth grade find it difficult to 
understand because the higher-level structure of the story is 
complex and subtle. The main character is a small boy given a 
verbal grocery list by his mother. Understanding the story 
depends on distinguishing between times the boy is rehearsing the 
list in order to remember it and times he is repeating the same 
list in order to figure out what went wrong. Because of this 
twist, the book may be more complex than its low score implies. 
Relying on the formulas either to gauge the book's readability or
a child's reading level could be worse than useless. 

A second major use for readability formulas has been as 
guidelines for the simplification of existing texts and 
documents. Here, too, using these formulas is inappropriate. 
Although they may, in certain cases, assign reasonable numerical 
values to texts, they by no means justify modifications of an 
existing text. Yet, in cases where readability formulas are 
used, writers naturally tend to write to the formulas. Such 
prescriptive use magnifies the inaccuracies inherent in the 
formulas. 

Several studies have investigated the effect of using 
readability formulas to guide text revision. An exercise in 
rewriting jury instructions demonstrated that the score of 
revised instructions on a readability measure had little to do 
with how well they were understood by jurors (Charrow & Charrow, 
1979).

A study by Davison, Kantor, Hannah, Hermon, Lutz, and 
Salzillo (1980) showed that adapting texts in the Science 
Research Associates Skillbuilders series to fit the formulas was 
not only ineffective, but, in many cases, actually increased the 
difficulty of the text. For example, in a passage about trees,
the sentence

If given a chance before another fire comes, the
tree will heal its own wounds by growing new bark over 
the burned part. 

was changed to 

If given a chance before another fire comes, the 
tree will heal its own wounds. It will grow new bark 
over the burned part. 

The modified text contains shorter sentences, so according to
most readability formulas it should be easier to read. However, 
the reader must now make the inference that the new bark is the 
mechanism by which the tree heals its wounds without an explicit 
statement of this fact. Thus, the adapted text may actually be 
more difficult than the original. 
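
The effect of the adaptation on the sentence-length factor can be checked directly. The sketch below is a hypothetical helper, not part of any particular readability formula; it counts words per sentence for the two versions quoted above, using a simple punctuation-based sentence splitter as an assumption.

```python
import re


def words_per_sentence(text: str) -> float:
    """Average sentence length in words, splitting on ., ! and ?."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences:
        raise ValueError("text must contain at least one sentence")
    return len(words) / len(sentences)


original = ("If given a chance before another fire comes, the tree will "
            "heal its own wounds by growing new bark over the burned part.")
adapted = ("If given a chance before another fire comes, the tree will "
           "heal its own wounds. It will grow new bark over the burned part.")

print(words_per_sentence(original))  # 23.0
print(words_per_sentence(adapted))   # 12.0
```

The average sentence length drops from 23 words to 12, so any formula weighted on sentence length will score the adapted version as easier, even though the word "by", which stated the causal link explicitly, has been removed.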

Criteria for Applicability 

The preceding examples illustrate various ways in which 
readability formulas yield faulty predictions, or even lead to 
the writing of passages which are harder to read. As a series of 
separate examples, they do not show why readability formulas fail 
nor do they distinguish among different situations in which the
formulas might be more or less appropriate. In each case,
however, we can point to an assumption about the use of the
formulas which has been violated. On the basis of these examples
of readability formula failure, then, we are led to the
conclusion that the formulas are valid only if certain conditions
hold. Interestingly, similar lists of conditions have been put
forth by designers of the formulas themselves. It is becoming
increasingly clear that readability formulas should be used only
where the following criteria are met:



1. Material may be freely read. Material like
captioning for the deaf, which appears on the
screen and then disappears after a certain time,
cannot be freely read. The time spent on it is
limited by external factors, not by choice of the
reader.

2. Text honestly written. The formulas assume that
material is not written to satisfy the readability 
formulas, but rather to satisfy some other 
communicative goal. 

3. Higher-level text structures are irrelevant. The
formulas assume that organizational material, 
information about intentions, goals, etc. need not 
be specifically taken into account. 

4. Purpose in reading is irrelevant. Skimming,
test-taking, reading for pleasure, and so on are
all taken to be equivalent in determining the
readability of a passage. 

5. Statistical averages are meaningful in individual 
cases. Use of the formulas implies that
statistical averages regarding both texts and 
readers can provide useful information regarding 
the appropriateness of an individual text for an 
individual person. 

6. Readers of interest are the same as the readers on
whom the readability formula was validated. Any
attempt to expand the use of the formula to 
evaluate materials for readers whose background, 
dialect, purpose in reading, etc. differs from 
those of the readers used in validation is likely 
to lead to difficulties. 



Unfortunately, it appears that not only some, but nearly 
all, uses of readability formulas violate the basic assumptions 
on their applicability. Rigorous adherence to these assumptions 
effectively prevents use of readability formulas for TV 
captioning, adaptation, selection of texts for readers of 
different cultural backgrounds, designing special texts for 
children, selection of text passages, choosing trade books, or 
designing remedial readers, and restricts readability formula use 
to trivial cases of little import for educational or social
policy. 

We are left with a question: Are there any areas in which
the assumptions about the readability formulas are satisfied and
the formulas improve on intuitive estimates of the readability of 
a text? We think not. The real factors that affect readability 
are elements such as the background knowledge of the reader
relative to the knowledge presumed by the writer, the purpose of 
the reader relative to the purpose of the writer, and the purpose 
of the person who is presenting the text to the reader. These 
factors cannot be captured in a simple formula, and ignoring them
may do more harm than good. 